
Ollama


This guide describes how to install, update, configure, and uninstall Ollama on NVIDIA Jetson Orin devices. Ollama enables local inference for large language models (LLMs) with CUDA acceleration, and is optimized specifically for Jetson hardware.


1. Overview

Ollama provides:

  • Fast local LLM inference
  • CUDA acceleration support
  • Built-in model version management
  • Simple CLI tool with optional WebUI

This guide covers:

  • Installation via script or Docker
  • Running models
  • Updating Ollama and models
  • Optional remote access setup
  • Complete uninstallation procedure



2. System Requirements

Hardware

Component    Minimum Requirement
Device       Jetson Orin Nano / NX
Memory       ≥ 8GB (for running small to medium models)
Storage      ≥ 10GB (for model and cache storage)

Software

  • Ubuntu 20.04 or 22.04 (depending on the JetPack version)
  • JetPack 5.1.1+ (includes CUDA, cuDNN, TensorRT)
  • Python 3.8+ (optional)
  • Docker (optional, for containerized deployment)
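
To confirm that the JetPack-provided CUDA toolkit is available, you can check its version (a quick sanity check; on Jetson the CUDA binaries usually live under /usr/local/cuda/bin, so add that directory to PATH if the command is not found):

nvcc --version    # prints the CUDA toolkit version bundled with JetPack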

3. Installing Ollama

Method A: Installation Script (Recommended)

Run the official installation script:

curl -fsSL https://ollama.com/install.sh | sh
  • Installs the CLI binary and background service.
  • CUDA support is enabled by default on Jetson devices.
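
To confirm the installation, you can check the CLI version and the status of the systemd service set up by the script (the same ollama service unit referenced later in this guide):

ollama -v                  # should print the installed version
systemctl status ollama    # background service installed by the script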

Method B: Docker-Based Installation (Optional)

sudo docker run --runtime nvidia --rm --network=host \
  -v ~/ollama:/ollama \
  -e OLLAMA_MODELS=/ollama \
  dustynv/ollama:r36.4.0

🧩 This Docker image is maintained by Jetson community contributor dustynv, optimized for JetPack environments.
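
Because the container is started with --network=host, the Ollama API inside the container is reachable on the host at the default port 11434; a quick check while the container is running (the expected response is a short "Ollama is running" message):

curl http://localhost:11434/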


4. Usage

Common Commands

ollama serve            # Start the Ollama background service
ollama run <model>      # Run a model
ollama pull <model>     # Download a model from the registry
ollama list             # List installed models
ollama show <model>     # Display model information
ollama rm <model>       # Remove a model
ollama help             # Show help menu
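
A short end-to-end sketch of these commands, using the deepseek-r1:7b model from section 6 purely as an example tag:

ollama pull deepseek-r1:7b    # download the model from the registry
ollama list                   # confirm it appears among installed models
ollama show deepseek-r1:7b    # print model details
ollama rm deepseek-r1:7b      # remove it when no longer needed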

Check Version

ollama -v
# Sample output: ollama version 0.5.7

Start the Service (If Not Auto-Started)

ollama serve &

5. (Optional) Enable Remote Access

To allow external devices to access the Ollama service:

1. Edit the systemd service file:

    sudo nano /etc/systemd/system/ollama.service

2. Add the following lines under the [Service] section:

    Environment="OLLAMA_HOST=0.0.0.0"
    Environment="OLLAMA_ORIGINS=*"

3. Reload and restart the service:

    sudo systemctl daemon-reload
    sudo systemctl restart ollama
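
Once the service restarts, another device on the same network should be able to reach the API (replace <jetson-ip> with your Jetson's address; 11434 is Ollama's default port):

curl http://<jetson-ip>:11434/api/tags    # lists the models installed on the Jetson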

6. Running a Model

Use the ollama run command to start model inference:

ollama run deepseek-r1:7b
  • For more available models, see: https://ollama.com/search
  • The model will be downloaded on first run and cached locally for future use.
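
Besides the interactive CLI, a running Ollama service can also be queried over its local HTTP API; a minimal sketch using the /api/generate endpoint (model name as in the example above, streaming disabled to get a single JSON response):

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'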

7. Update

Update to the Latest Version:

curl -fsSL https://ollama.com/install.sh | sh

(Optional) Install a Specific Version

To install a specific version, specify the version number like this:

curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.1.32 sh

8. Uninstall

Stop and Remove the System Service

sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service

Remove the Executable

sudo rm $(which ollama)

(Note: Ollama is typically installed in /usr/local/bin, /usr/bin, or /bin.)

Delete Model Files and User Account

sudo rm -r /usr/share/ollama
sudo userdel ollama
sudo groupdel ollama
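
If you used the Docker-based installation, the model cache mounted from the host (the ~/ollama directory from the command in section 3) is not removed by the steps above; delete it separately if you no longer need the downloaded models:

rm -rf ~/ollama    # host-side model cache used by the Docker method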

9. Troubleshooting

Issue                         Solution
Port 11434 not responding     Restart ollama serve or reload the system service
Installation failed           Ensure curl is installed and you have internet access; try using sudo
Unable to uninstall Ollama    Use which ollama to locate the actual path, then delete it manually
Out of Memory (OOM) error     Try a smaller model (e.g., 1.5b or 7b), or add swap space
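
For the OOM case, a common way to add swap space (a sketch assuming an 8GB swap file and enough free storage; adjust the size to your device):

sudo fallocate -l 8G /swapfile    # create an 8GB swap file
sudo chmod 600 /swapfile          # restrict permissions as required by swapon
sudo mkswap /swapfile             # format it as swap
sudo swapon /swapfile             # enable it for the current session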

10. Appendix

Path References

Purpose                  Path
Ollama executable        /usr/local/bin/ollama
Model cache              ~/ollama/ or /usr/share/ollama
Service configuration    /etc/systemd/system/ollama.service
